| Variable | Distribution | Mean / Rate | SD / Coef. | K-S p-value |
|---|---|---|---|---|
| Assets (raw) | Normal | 36727 | 57935 | 2.81e-131 |
| Price Exposure | Normal | 0.056 | 0.132 | 3.22e-88 |
| Credit Rate | Logistic model | 25.7% | 3.56e-07 | — |
| Enterprise Prevalence | Empirical | 25.7% | — | — |
Household Enterprise Entry as a Coping Mechanism Under Agricultural Price Shocks: An Agent-Based Model Calibrated to LSMS-ISA Panel Data
This paper presents an agent-based model (ABM) investigating household enterprise entry as a coping mechanism in response to agricultural price shocks in Sub-Saharan Africa. Using household panel data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) for Tanzania and Ethiopia, we calibrate distributional parameters for assets, price exposure, and credit access. The model implements household agents that make enterprise participation decisions based on rule-based policies, with an LLM-augmented decision architecture designed for future evaluation. Through parameter sweeps and behavior exploration across calibrated synthetic households, we examine the sensitivity of enterprise dynamics to policy thresholds. Our findings demonstrate that the ABM reproduces key patterns observed in the empirical data, including enterprise prevalence trends and path-dependent household trajectories. We discuss the model’s limitations, including the absence of agent-agent interactions and the exploratory nature of the current computational analysis, and outline directions for more rigorous validation.
1 Introduction
Agricultural price volatility poses significant risks to rural households in developing economies. When cash crop prices decline, households face reduced income and may adopt various coping strategies to smooth consumption (Dercon 2002). One such strategy is enterprise entry—the initiation of non-farm business activities as an alternative income source. Understanding the dynamics of enterprise entry under price shocks has important implications for rural development policy and poverty reduction strategies.
This paper develops an agent-based model (ABM) to investigate the relationship between agricultural price shocks and household enterprise participation in Sub-Saharan Africa. The model is calibrated to household panel data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) for Tanzania (2008-2014) and Ethiopia (2011-2015) (World Bank 2024).
Research Question: Do negative agricultural price shocks induce enterprise entry as a coping mechanism, and how do household characteristics (assets, credit access) mediate this relationship?
Contributions: This work makes three contributions to the literature:
Calibrated Microsimulation: We develop a generative microsimulation approach where distributional parameters (assets, shocks, credit) are fitted from empirical data, enabling synthetic panel generation that preserves key statistical properties.
Pattern-Oriented Validation: Following Pattern-Oriented Modeling principles (Grimm et al. 2010; Railsback and Grimm 2019), we validate the model against multiple empirical patterns rather than single metrics.
LLM-Augmented Policy Architecture: We design (but do not yet execute) an architecture for LLM-based household decision-making, contributing to the emerging literature on AI-augmented social simulation.
Scope and Limitations: The current model does not include direct agent-agent interactions; households respond independently to shared exogenous price shocks. Aggregate patterns emerge from heterogeneous individual responses, not from complex adaptive dynamics. This design choice reflects the empirical setting where household enterprise decisions are primarily driven by household-level factors rather than social network effects.
3 Data
3.1 LSMS-ISA Panel Data
The model is calibrated using the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) harmonized panel data (World Bank 2024). We use:
- Tanzania: 4 waves (2008-2014), N=500 households
- Ethiopia: 3 waves (2011-2015), N=500 households
The data include household-level information on enterprise participation, asset holdings, credit access, and agricultural production. Price exposure is computed from household crop portfolios and regional price indices.
3.2 Data Processing and Derived Variables
Derived target variables include:
- Enterprise indicator: Binary (0/1) for non-farm enterprise operation
- Asset index: Standardized index of durable assets
- Credit access: Binary indicator for formal credit access
- Price exposure: Weighted average of crop-specific price changes
See docs/DATA_CONTRACT.md for the full schema specification and docs/DATA_AUDIT.md for provenance documentation.
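The price exposure variable can be illustrated with a minimal sketch, assuming the weights are each crop's share of the household portfolio and that a crop with no regional price observation contributes zero. The function name, the crop names, and the numbers are hypothetical and are not taken from the repository.

```python
def price_exposure(crop_shares, price_changes):
    """Share-weighted average of crop-specific price changes.

    crop_shares:   {crop: weight in the household portfolio} (assumed weights)
    price_changes: {crop: proportional price change from a regional index}
    """
    total = sum(crop_shares.values())
    if total <= 0:
        return 0.0  # assumption: households with no crops have zero exposure
    weighted = sum(
        share * price_changes.get(crop, 0.0)  # assumption: missing price -> 0
        for crop, share in crop_shares.items()
    )
    return weighted / total


# Hypothetical household: 60% maize, 40% coffee.
shares = {"maize": 0.6, "coffee": 0.4}
changes = {"maize": 0.05, "coffee": -0.20}
exposure = price_exposure(shares, changes)
```

A negative value indicates that the household's portfolio lost value on net, which is the case the entry rule in Section 4.3.1 responds to.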
3.3 Calibration Artifact
Distribution parameters are fitted from the LSMS data and stored in a calibration artifact (artifacts/calibration/tanzania/calibration.json). Key fitted distributions:
Note on K-S Tests: The Kolmogorov-Smirnov tests reject the null hypothesis that assets and price exposure follow the fitted normal distributions (p = 2.81e-131 and p = 3.22e-88, respectively). Rejection is expected for heavy-tailed economic data with large sample sizes (Clauset, Shalizi, and Newman 2009). Visual inspection (QQ-plots in Appendix) and moment-matching suggest the normal approximation captures the central tendency adequately for our simulation purposes, though future work should explore alternative distributional families.
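The behavior described in this note can be reproduced on synthetic data. The sketch below is an illustration only: it draws a lognormal sample (a stand-in for heavy-tailed assets, not the LSMS data), fits a normal distribution by moment matching, and computes the one-sample K-S statistic by hand. The 1.36/sqrt(n) threshold is the usual asymptotic 5% critical value for a fully specified distribution, so with estimated parameters it is only approximate.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, dist):
    """One-sample K-S statistic: max gap between empirical and model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


rng = random.Random(0)
# Synthetic heavy-tailed sample (assumption: lognormal as a stand-in).
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]

# Moment-matched normal, analogous to the fitted normal in the calibration.
fitted = NormalDist(mean(sample), stdev(sample))

d_stat = ks_statistic(sample, fitted)
critical_5pct = 1.36 / len(sample) ** 0.5  # asymptotic approximation
rejects = d_stat > critical_5pct
```

The normal fit places substantial probability mass below zero where the sample has none, so the statistic far exceeds the threshold even though the first two moments match.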
4 Model
4.1 Overview (ODD Protocol)
The model follows the ODD+D protocol (Grimm et al. 2020). This section provides a summary; the full ODD description is in Appendix A.
4.1.1 Purpose
The model investigates the relationship between agricultural price shocks and household enterprise entry, with heterogeneous responses by asset holdings and credit access. Households are classified as “stayers” (persistent entrepreneurs, >50% waves) or “copers” (intermittent responders).
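The classification rule can be sketched as follows, assuming that a "coper" is any household observed in enterprise in at least one wave but not in more than 50% of waves, and that households never in enterprise are labeled "none" (the third value of the classification state variable). The function name is hypothetical.

```python
def classify(statuses):
    """Classify a household from its per-wave enterprise status (0/1)."""
    if not statuses:
        return "none"
    share = sum(statuses) / len(statuses)
    if share > 0.5:
        return "stayer"   # in enterprise in more than 50% of waves
    if share > 0:
        return "coper"    # assumption: intermittent means at least one wave
    return "none"


labels = [classify(h) for h in ([1, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0])]
```

Under this reading, a household in enterprise in exactly half of the waves is a coper, because the stayer threshold is strict.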
4.1.2 Entities, State Variables, and Scales
Primary Entity: HouseholdAgent
| Variable | Type | Description |
|---|---|---|
| household_id | string | Unique identifier |
| wave | int | Survey wave (1-4) |
| assets | float | Standardized asset index |
| credit_access | int | Binary credit access |
| enterprise_status | int | Binary enterprise status |
| price_exposure | float | Price shock exposure |
| classification | string | stayer/coper/none |
Temporal Scale: Discrete time steps corresponding to survey waves (~2-year intervals).
Spatial Scale: No explicit spatial structure; agents respond independently to shared price distributions.
4.1.3 Process Overview
Each simulation step:
- Environment update: Price shock distribution for current wave
- Agent activation: All agents in random order
- Decision: Each agent queries policy for action
- State update: Agent applies action (ENTER/EXIT/NO_CHANGE)
- Data collection: Outcomes recorded
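The five steps above can be sketched in plain Python without Mesa. This is a simplified illustration, not the repository's EnterpriseCopingModel: the Household dataclass, the step function, the normal shock distribution, and its parameters are all assumptions introduced for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    household_id: str
    assets: float
    enterprise_status: int = 0
    price_exposure: float = 0.0


def step(households, policy, rng, wave, shock_mean=0.0, shock_sd=0.1):
    """Run one wave and return the recorded outcomes."""
    records = []
    order = list(households)
    rng.shuffle(order)  # agent activation in random order
    for hh in order:
        # Environment update: draw this wave's exposure (assumed normal).
        hh.price_exposure = rng.gauss(shock_mean, shock_sd)
        action = policy(hh)  # decision
        # State update: apply ENTER / EXIT / NO_CHANGE.
        if action == "ENTER":
            hh.enterprise_status = 1
        elif action == "EXIT":
            hh.enterprise_status = 0
        # Data collection.
        records.append({
            "household_id": hh.household_id,
            "wave": wave,
            "action": action,
            "enterprise_status": hh.enterprise_status,
        })
    return records


def always_enter(hh):
    """Toy policy used only to exercise the loop."""
    return "ENTER" if hh.enterprise_status == 0 else "NO_CHANGE"


rng = random.Random(42)
households = [Household(f"hh{i}", assets=rng.gauss(0, 1)) for i in range(5)]
outcomes = step(households, always_enter, rng, wave=1)
```

Because agents do not interact, the activation order affects only the sequence of random draws, not the decision logic itself.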
4.2 Design Concepts
4.2.1 Heterogeneity Without Interaction
Agents differ in assets, credit access, and initial enterprise status. These differences produce heterogeneous responses to common exogenous shocks. However, agents do not directly interact; there are no network effects, peer influences, or market feedback loops in the current implementation.
This design choice reflects our focus on household-level coping decisions rather than social contagion or general equilibrium effects. Aggregate patterns (enterprise prevalence, classification distributions) arise from the aggregation of heterogeneous individual responses (Epstein 2008).
4.2.2 Stochasticity
Sources of stochasticity include:
- Price shock draws from calibrated distributions
- Asset initialization (synthetic mode)
- Agent activation order (Mesa RandomActivation)
Centralized RNG with recorded seeds ensures reproducibility.
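A minimal sketch of this idea, assuming a single random.Random instance is passed to every stochastic component and the seed is written to a manifest. The make_run helper and the manifest layout are hypothetical; the actual manifest.json schema is not specified here.

```python
import json
import random


def make_run(seed):
    """Create the single RNG for a run and record its seed."""
    rng = random.Random(seed)
    manifest = {"seed": seed}  # hypothetical manifest layout
    return rng, manifest


rng_a, manifest = make_run(42)
rng_b, _ = make_run(42)

# Two runs with the same recorded seed produce identical draws.
draws_a = [rng_a.gauss(0.0, 1.0) for _ in range(3)]
draws_b = [rng_b.gauss(0.0, 1.0) for _ in range(3)]

manifest_json = json.dumps(manifest)
```

Routing every draw through one seeded generator means a run can be replayed from the manifest alone, provided the order of draws does not change between code versions.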
4.3 Decision Policies
4.3.1 Rule-Based Policy (Executed)
The baseline RulePolicy implements deterministic threshold-based decisions:
- Enter: price_exposure < price_threshold AND assets > asset_threshold AND NOT currently in enterprise
- Exit: assets < exit_threshold AND currently in enterprise
- No Change: Otherwise
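The three rules can be written as a small class. This is a sketch of the stated logic rather than the repository's RulePolicy: the method signature is an assumption, and the default threshold values are placeholders, since the actual defaults are not listed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RulePolicySketch:
    # Placeholder defaults (assumptions, not the project's values).
    price_threshold: float = -0.1
    asset_threshold: float = 0.0
    exit_threshold: float = -0.5

    def decide(self, price_exposure, assets, in_enterprise):
        """Return ENTER, EXIT, or NO_CHANGE for one household."""
        if (not in_enterprise
                and price_exposure < self.price_threshold
                and assets > self.asset_threshold):
            return "ENTER"
        if in_enterprise and assets < self.exit_threshold:
            return "EXIT"
        return "NO_CHANGE"


policy = RulePolicySketch()
decisions = [
    policy.decide(price_exposure=-0.2, assets=0.5, in_enterprise=False),
    policy.decide(price_exposure=0.1, assets=-1.0, in_enterprise=True),
    policy.decide(price_exposure=0.1, assets=0.5, in_enterprise=False),
]
```

Note that entry requires both a sufficiently negative price shock and assets above the threshold, while exit depends on assets alone.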
4.3.2 LLM-Augmented Policy (Design Only)
We have designed a MultiSampleLLMPolicy architecture for future evaluation. Key features:
- K samples at temperature T (default: K=5, T=0.6)
- Constraint validation for feasibility
- Majority vote aggregation with conservative tie-break
- State-based caching for reproducibility
This policy has not been executed. All empirical results in this paper use the rule-based policy. See Discussion for planned LLM evaluation.
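The aggregation step of such a policy might look like the sketch below. It assumes that "constraint validation" means discarding infeasible actions (entering while already in enterprise, exiting while not) and that the "conservative tie-break" resolves to NO_CHANGE; both are assumptions about the design. The sampled actions are supplied as a plain list, so no LLM call is involved, and caching is not shown.

```python
from collections import Counter


def is_feasible(action, in_enterprise):
    """Assumed feasibility constraints on a sampled action."""
    if action == "ENTER":
        return not in_enterprise
    if action == "EXIT":
        return in_enterprise
    return action == "NO_CHANGE"


def aggregate(samples, in_enterprise):
    """Majority vote over feasible samples; ties fall back to NO_CHANGE."""
    valid = [a for a in samples if is_feasible(a, in_enterprise)]
    if not valid:
        return "NO_CHANGE"
    counts = Counter(valid).most_common()
    top = counts[0][1]
    winners = [action for action, count in counts if count == top]
    return winners[0] if len(winners) == 1 else "NO_CHANGE"


# K=5 hypothetical samples for a household not currently in enterprise.
vote = aggregate(["ENTER", "ENTER", "NO_CHANGE", "ENTER", "EXIT"],
                 in_enterprise=False)
```

Falling back to NO_CHANGE on ties or when every sample is infeasible keeps the household's state unchanged whenever the samples give no clear signal.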
4.4 Architecture
flowchart TB
subgraph DataPipeline["Data Pipeline"]
LSMS["LSMS-ISA Data"]
ETL["ETL Module"]
Calibrate["Calibration"]
CalJSON["calibration.json"]
end
subgraph ABMCore["ABM Core (Mesa 3)"]
SynthGen["Synthetic Panel Generator"]
Model["EnterpriseCopingModel"]
Agents["HouseholdAgents"]
Policy["RulePolicy"]
end
subgraph Outputs["Outputs"]
Outcomes["household_outcomes.parquet"]
Manifest["manifest.json"]
end
LSMS --> ETL
ETL --> Calibrate
Calibrate --> CalJSON
CalJSON --> SynthGen
SynthGen --> Model
Model --> Agents
Agents --> Policy
Policy --> Outcomes
Model --> Manifest
5 Experimental Design
5.1 Baseline Scenario
The baseline scenario uses LSMS-derived household data with the RulePolicy:
- Country: Tanzania
- N: 500 households
- Waves: 4 (matching LSMS structure)
- Seed: 42
- Policy: RulePolicy with default thresholds
5.2 Parameter Sweeps
We conducted parameter sweeps over calibrated synthetic households to examine sensitivity:
- Grid: 6×6 (price_threshold × asset_threshold)
- Price threshold range: [-0.3, 0.0]
- Asset threshold range: [-1.0, 1.0]
- Seeds per cell: 2
- N per run: 100 households
- Total runs: 72
Data source: Calibrated synthetic (SyntheticPanelGenerator with CalibrationArtifact)
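The run count follows from the grid: 6 × 6 cells × 2 seeds = 72 runs. The sketch below enumerates such a design, assuming evenly spaced grid points that include both endpoints and seed values of 0 and 1; the actual spacing and seeds used by scripts/run_sweep.py are not stated here.

```python
from itertools import product


def linspace(lo, hi, n):
    """n evenly spaced points from lo to hi, endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


price_grid = linspace(-0.3, 0.0, 6)
asset_grid = linspace(-1.0, 1.0, 6)
seeds = [0, 1]  # assumed seed values

runs = [
    {"price_threshold": p, "asset_threshold": a,
     "seed": s, "n_households": 100}
    for p, a, s in product(price_grid, asset_grid, seeds)
]
```

Each dictionary in runs describes one simulation of 100 calibrated synthetic households.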
5.3 Behavior Exploration
We conducted random search over parameter space:
- Candidates: 40
- Seeds per candidate: 2
- Objective: MSE between simulated and LSMS-derived enterprise rates
Target enterprise rates (LSMS-derived):
- Wave 1: 19.6%
- Wave 2: 26.0%
- Wave 3: 28.6%
- Wave 4: 28.4%
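The search objective can be computed as in the sketch below, assuming the MSE is taken over the four wave-level enterprise rates with equal weights. The target values are those listed above; the simulated rates in the example are hypothetical.

```python
TARGET_RATES = [0.196, 0.260, 0.286, 0.284]  # LSMS-derived, waves 1-4


def mse(simulated, targets=TARGET_RATES):
    """Mean squared error between simulated and target enterprise rates."""
    if len(simulated) != len(targets):
        raise ValueError("simulated and target series must have equal length")
    return sum((s - t) ** 2 for s, t in zip(simulated, targets)) / len(targets)


# Hypothetical simulated rates for one candidate parameter set.
example = mse([0.20, 0.25, 0.30, 0.30])
```

Because rates are proportions, an MSE on the order of 0.002 corresponds to a typical per-wave deviation of roughly four to five percentage points.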
6 Results
6.1 Pattern Validation (LSMS-Derived Baseline)
6.1.1 Enterprise Prevalence
6.1.2 Household Classification
6.1.3 Path Dependence
Transition Probabilities (Current → Next Wave)

| Current \ Next | In Enterprise | Not in Enterprise |
|---|---|---|
| In Enterprise | 100.0% | 0.0% |
| Not in Enterprise | 0.0% | 100.0% |
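A transition table of this kind can be computed from per-household status histories as sketched below. The function and the toy histories are illustrative assumptions, not the project's analysis code.

```python
def transition_probabilities(histories):
    """Row-normalized counts of (current status -> next-wave status)."""
    counts = {1: {1: 0, 0: 0}, 0: {1: 0, 0: 0}}
    for statuses in histories:
        for current, nxt in zip(statuses, statuses[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, row in counts.items():
        total = sum(row.values())
        probs[current] = {
            nxt: (count / total if total else float("nan"))
            for nxt, count in row.items()
        }
    return probs


# Toy panel in which no household ever changes status.
static_panel = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
static_probs = transition_probabilities(static_panel)

# Toy panel with some entry.
mixed_probs = transition_probabilities([[0, 1, 1], [0, 0, 1]])
```

A panel in which no household changes status between waves yields exactly the 100%/0% pattern shown in the table, so that pattern indicates the absence of observed entries and exits rather than a particular persistence mechanism.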
6.2 Parameter Sensitivity (Calibrated Synthetic)
6.2.1 Policy Threshold Heatmap
6.2.2 Behavior Exploration Results
| ID | Price Thresh | Asset Thresh | Exit Thresh | Objective (MSE) |
|---|---|---|---|---|
| 6 | 0.014 | 0.395 | -0.484 | 0.0017 |
| 24 | 0.027 | -0.798 | -1.883 | 0.0017 |
| 15 | 0.002 | -0.338 | -1.423 | 0.0022 |
| 33 | 0.081 | 1.226 | -0.601 | 0.0023 |
| 8 | -0.011 | -0.916 | -1.067 | 0.0028 |
7 Robustness and Diagnostics
7.1 Multi-Seed Variability
Interpretation: With 10 seeds, the coefficient of variation (CV) for enterprise rate is less than 5%, indicating stable aggregate outcomes. However, 10 seeds is insufficient for robust inference; publication-quality analysis should use 30-100+ replications (Bankes 1993).
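The coefficient of variation is the sample standard deviation divided by the mean. A minimal sketch follows; the ten per-seed rates are hypothetical values chosen for illustration and are not the study's results.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical final-wave enterprise rates from 10 seeds (illustrative only).
rates = [0.250, 0.262, 0.255, 0.248, 0.259,
         0.251, 0.263, 0.257, 0.249, 0.256]
cv = coefficient_of_variation(rates)
stable = cv < 0.05
```

With only 10 values, the standard deviation itself is estimated imprecisely, which is one reason a larger number of replications is recommended.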
7.2 Regression Analysis
| Term | Coefficient | Std. Error | t-statistic | p-value | Sign Match |
|---|---|---|---|---|---|
| price_exposure | 0.0000 | NaN | NaN | NaN | No |
8 Limitations and Future Work
8.1 Current Limitations
8.1.1 Model Architecture
No Agent Interactions: Households respond independently to exogenous shocks. Social network effects, peer influence, and local market feedback are not modeled.
Exogenous Price Shocks: Prices are drawn from calibrated distributions without feedback from enterprise activity. No general equilibrium effects.
Binary Outcomes: Enterprise status is 0/1; enterprise type, size, and profitability are not modeled.
8.1.2 Calibration Limitations
Heavy-Tailed Distributions: K-S tests reject normality for asset distributions. Future work should explore lognormal, Pareto, or generalized extreme value distributions (Clauset, Shalizi, and Newman 2009).
Copula Dependence: The Gaussian copula captures linear dependence but may miss tail dependence important for extreme events.
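For readers unfamiliar with the construction, the sketch below draws one pair of values through a two-dimensional Gaussian copula: correlated standard normals are mapped to uniforms and then through each marginal's inverse CDF. The marginals reuse the fitted normal parameters for assets and price exposure reported in the calibration table; the correlation rho = 0.3 and the function name are assumptions for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_copula_pair(rng, rho, marginal_a, marginal_b):
    """Draw one (a, b) pair with Gaussian-copula dependence."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
    # Map to uniforms, clamped away from 0 and 1 so inv_cdf is defined.
    u1 = min(max(STD_NORMAL.cdf(z1), 1e-12), 1.0 - 1e-12)
    u2 = min(max(STD_NORMAL.cdf(z2), 1e-12), 1.0 - 1e-12)
    return marginal_a.inv_cdf(u1), marginal_b.inv_cdf(u2)


assets_dist = NormalDist(36727, 57935)      # from the calibration table
exposure_dist = NormalDist(0.056, 0.132)    # from the calibration table

rng = random.Random(7)
pairs = [gaussian_copula_pair(rng, 0.3, assets_dist, exposure_dist)
         for _ in range(5000)]
```

The same function accepts any marginal object with an inv_cdf method, which is how heavy-tailed marginals could be substituted; the dependence structure would still be Gaussian and would therefore still lack tail dependence.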
8.1.3 Computational Limitations
Exploratory Scale: Current analysis uses 10 seeds for robustness and 2 seeds per sweep cell. Publication-quality analysis requires 30-100+ replications.
Behavior Search: 40 candidates with 2 seeds each is exploratory, not optimization. Results identify promising parameter regions but should not be interpreted as optimal configurations.
8.1.4 LLM Policy Status
The LLM-augmented policy has not been executed. All results use the rule-based baseline. LLM policy sections describe the implemented architecture and planned evaluation approach, not empirical findings.
8.2 Future Work
LLM Policy Evaluation: Execute the MultiSampleLLMPolicy and compare to rule-based baseline and ML benchmarks.
Cross-Country Validation: Test model calibrated on Tanzania against Ethiopia data.
Extended Replications: Increase seed count to 50-100 for robust inference.
Alternative Distributions: Fit heavy-tailed distributions (lognormal, Pareto) for assets.
Agent Interactions: Introduce network effects and local market dynamics.
9 Conclusion
This paper presents an agent-based model of household enterprise entry as a coping mechanism under agricultural price shocks, calibrated to LSMS-ISA panel data from Tanzania and Ethiopia. The model successfully reproduces key empirical patterns including enterprise prevalence trends, household classification distributions, and path-dependent trajectories.
Our contribution is methodological: we demonstrate a generative microsimulation approach where distributional parameters are fitted from empirical data, enabling systematic exploration of parameter sensitivity through calibrated synthetic panels. The approach maintains clear provenance and reproducibility through manifest tracking and centralized random number generation.
We emphasize that current results are exploratory. The model lacks agent interactions that would be required for claims about complex adaptive dynamics. The LLM-augmented decision policy, while fully implemented, has not been executed due to computational constraints. Future work will address these limitations through expanded replication, alternative distributional families, and LLM policy evaluation.
The model provides a foundation for policy analysis of interventions targeting household enterprise coping strategies, including credit access expansion, price stabilization programs, and asset transfer schemes. Such analysis requires careful attention to the model’s epistemic boundaries and the distinction between model outputs and real-world predictions.
10 References
11 Appendix A: Full ODD Description
See docs/abm_report.qmd for the complete ODD+D protocol description, including:
- Detailed purpose and scope
- Complete entity state variables
- Process scheduling pseudocode
- Design concepts (adaptation, learning, sensing, interaction, stochasticity)
- Submodel specifications
- Code references with line numbers
12 Appendix B: Data Provenance
12.1 Data Source Classification
| Classification | Code | Description |
|---|---|---|
| LSMS-derived | lsms | Uses load_derived_targets() from processed LSMS |
| Calibrated synthetic | calibrated | Uses SyntheticPanelGenerator with CalibrationArtifact |
12.2 Figure/Table Data Sources
| Figure/Table | Data Source | Path |
|---|---|---|
| Figure 2 | LSMS-derived | outputs/tanzania/baseline/ |
| Figure 3 | LSMS-derived | outputs/tanzania/baseline/ |
| Table 3 | LSMS-derived | outputs/tanzania/baseline/ |
| Figure 4 | Calibrated synthetic | outputs/sweeps/calibrated/ |
| Table 4 | Calibrated synthetic | outputs/search/calibrated/ |
| Figure 5 | LSMS-derived | outputs/batch/lsms/ |
12.3 Calibration Artifact
- Path: artifacts/calibration/tanzania/calibration.json
- Git commit: See artifact git_commit field
- Created: See artifact created_at field
13 Appendix C: Reproduction Commands
See docs/REPRODUCIBILITY.md for complete environment setup and command reference.
Quick Start:
# Setup
make setup # Python dependencies
make setup-r # R dependencies
# Calibration
abm calibrate --country tanzania --data-dir data/processed
# Baseline simulation
make run-sim COUNTRY=tanzania
# Parameter sweep (calibrated)
python3 scripts/run_sweep.py --calibration artifacts/calibration/tanzania/calibration.json
# Behavior search (calibrated)
python3 scripts/run_behavior_search.py --calibration artifacts/calibration/tanzania/calibration.json --targets-from-lsms
# Render this document
quarto render docs/paper.qmd --to html
Document generated: 2026-01-14
Repository: abm-enterprise-coping